Oncogenic activation has proven to be a valuable genetic marker for the identification of novel vertebrate
genes [Varmus, H., Science 240, 1427-1435 (1988)]. The ras gene family, certain tyrosine protein kinases (src
gene family, abl, trk, met, ret) and transcription factors (fos, jun, erbA) are just some of the best known examples.
Although the precise function of these genes remains to be elucidated, their capacity to induce neoplasia
strongly suggests that they play critical roles in the control of signal transduction processes [Bishop, J.M.,
Science 235, 305-311 (1987)].

The property of oncogenic activation has been used to isolate a number of novel human genes, one of which
(vav) has recently been characterized at the molecular level. The vav gene was first identified when it became
activated as an oncogene by a fortuitous rearrangement during the course of gene transfer assays [Katzav, S.
et al., EMBO J. 8, 2283-2290 (1989)]. Molecular characterization of the human vav oncogene revealed a
molecule capable of coding for a 797 amino acid polypeptide whose amino-terminus had been replaced by
spurious sequences derived from the bacterial Tn5 gene used to confer G418 resistance to the transfected cells
[Katzav, S. et al., supra]. The rest of the vav oncogene product contains a series of structural motifs reminiscent
of those found in certain transcription factors, including a highly acidic amino-terminal region and a cysteine-rich
domain that contains two putative metal binding structures [Johnson, P.F. et al., Annu. Rev. Biochem. 58,
799-839 (1989)].

The most intriguing feature of the vav gene is its pattern of expression. Analysis of vav gene transcripts in
a series of human cell lines indicated that the vav gene is specifically expressed in cells of hematopoietic origin
[Katzav, S. et al., supra]. No vav gene expression could be observed in epithelial, mesenchymal or
neuroectodermal cells. Interestingly, lymphoid, myeloid and erythroid cell lines contained comparable levels
of vav gene transcripts. Similar results were obtained when normal human cells were examined, including B
and T lymphocytes, macrophages and platelets [Katzav, S. et al., supra]. These observations suggest that the
vav gene may play a basic role in hematopoiesis that is not influenced by differentiation programs.

It would be useful to isolate oncogenes from other mammalian species related to the human vav oncogene
in order to more easily study the role of this protein in oncogenesis.

The present invention concerns an isolated nucleic acid molecule comprising a nucleic acid sequence coding
for all or part of a mouse vav proto-oncogene protein. Preferably, the nucleic acid molecule is a DNA
(deoxyribonucleic acid) molecule, and the nucleic acid sequence is a DNA sequence. Further preferred is a
DNA sequence having all or part of the nucleotide sequence substantially as shown in Figure 2 [SEQ. ID NO: 1].

The present invention further concerns expression vectors comprising a DNA sequence coding for all or 
part of a mouse vav proto-oncogene protein. 

The present invention additionally concerns prokaryotic or eukaryotic host cells containing an expression
vector which comprises a DNA sequence coding for all or part of a mouse vav proto-oncogene protein.

The present invention also concerns methods for detecting nucleic acid sequences coding for all or part
of a mouse vav proto-oncogene protein or related nucleic acid sequences.

The present invention further concerns polypeptide molecules comprising all or part of a mouse vav
proto-oncogene protein.

Figure 1 shows a schematic diagram of a nucleotide sequence analysis of a mouse vav proto-oncogene
cDNA clone. Untranslated 5' and 3' sequences are represented by a thin bar. Coding sequences are depicted
by a thicker box and are flanked by the initiator (ATG) and terminator (TGA) codons. Highlighted domains
include the leucine-rich domain (shaded box); the acidic region (black box); two proline-rich stretches (open
box); two putative nuclear localization signals (left hatched box) and a cysteine-rich region (right hatched box).

Figure 2 shows the nucleotide [SEQ. ID NO: 1] and deduced amino acid [SEQ. ID NO: 2] sequence of the
2793 bp insert of pMB24. The sequences of the flanking EcoRI linkers have been omitted. Numbers to the right
of the sequence indicate nucleotide numbers and those to the left amino acid numbers. Underlined sequences
correspond to those structures highlighted in (A). The cysteine-rich domain has been boxed. Cysteine and
histidine residues corresponding to the putative zinc finger-like structures (Cys-X2-Cys-X13-Cys-X2-Cys and
His-X2-Cys-X6-Cys-X2-His) have been shaded. A putative protein kinase A phosphorylation site is underlined by
a crosshatched box and a putative polyadenylation signal by a wavy line.

Figure 3 shows detection of mouse vav gene transcripts. Two micrograms of poly A-selected RNA isolated
from adult mouse tissues including (a) lung; (b) heart; (c) testes; (d) muscle; (e) intestine; (f) brain; (g) kidney;
(h) spleen; (i) ovaries; (j) liver; and from murine cell lines including (k) NIH3T3 fibroblasts; (l) A20 B-lymphocyte
and (m) MOPC 315 plasmacytoma cells were submitted to Northern transfer analysis. Nitrocellulose filters were
hybridized under stringent conditions (50% v/v formamide, 42°C) to 5 x 10^7 cpm of a [32P]-labeled nick-translated
DNA fragment corresponding to the entire insert of pMB24. The hybridized filter was exposed to Kodak
X-OMAT film for 24 hours at -70°C with the help of intensifier screens. S. cerevisiae 23S and 18S ribosomal
RNAs were used as size markers. The migration of the 3 kb mouse vav proto-oncogene transcript is indicated
by a thick arrow.

Figure 4 shows identification of p95vav as a mouse vav proto-oncogene product. [35S methionine]-labeled
cell extracts of (A) PAb 280, a mouse B-cell hybridoma; (B) PMM8, a mouse T-cell hybridoma; (C) NIH3T3 cells
and (D) NIH3T3 cells transfected with pJC13, a pMEX-derived expression plasmid carrying a mouse vav
proto-oncogene cDNA clone, were immunoprecipitated with (a) preimmune rabbit serum or (b) an antiserum raised
against a peptide corresponding to a hydrophilic domain (amino acid residues 576-589) of a mouse vav protein
either in the absence (-) or in the presence (+) of 10 ng of competing peptide. Immunoprecipitates were loaded
onto 8% SDS-polyacrylamide gels. Electrophoresed gels were exposed to Kodak X-OMAT film for 2 days at
-70°C in the presence of intensifier screens. The migration of p95vav is indicated by a thick arrow. The migration
of co-electrophoresed molecular weight standards including myosin (200,000), phosphorylase B (92,500) and
bovine serum albumin (69,000) is also indicated.

Figure 5 shows the mechanism of activation of mouse and human vav oncogenes. Schematic representation
of pMEX-derived expression vectors carrying normal and mutated vav cDNA clones. Symbols are those
shown in Figure 1A. The presence of an MSV-LTR in each of these plasmids is also indicated. Bacterial
Tn5-derived sequences present in the pSK27 plasmid containing a human vav oncogene [Katzav, S. et al., supra]
are indicated by a dashed box. The [atg] symbol represents an in-frame translational initiator used by pJC12
and pJC7. This triplet codes for the methionine residues underlined in Figure 2. The right column indicates the
relative transforming activity of these plasmids (expressed as focus forming units per microgram of linearized
plasmid DNA) when tested in gene transfer assays using NIH3T3 cells as recipients.

Figure 6 shows that overexpression of wild type p95vav protein can induce morphologic transformation of
NIH3T3 cells. [35S methionine]-labeled cell extracts of (A) NIH3T3 cells; (B) NIH3T3 cells transformed by pJC13,
an expression plasmid containing a full vav cDNA clone; (C) NIH3T3 cells transformed by pJC7, an expression
plasmid containing a vav cDNA clone coding for a protein lacking the amino terminal domain (amino acid
residues 1 to 65); and (D) NIH3T3 cells transformed by pSK27, an expression plasmid containing the human vav
oncogene were immunoprecipitated with (a) normal rabbit serum and (b,c) a rabbit antiserum raised against a
vav peptide either (b) in the presence or (c) in the absence of 10 µg of competing peptide. Immunoprecipitates
were analyzed as indicated in the legend to Figure 4. The migration of the wild type p95vav and the truncated
p88vav proteins is indicated by thick arrows. Co-electrophoresed molecular weight markers are those described
in Figure 4 and ovalbumin (46,000).

Figure 7 shows the identification and mechanism of activation of a second human vav oncogene. DNAs
(10 µg) isolated from (a) a nude mouse tumor induced by NIH3T3 cells that contain a human vav oncogene
[Katzav, S. et al., supra]; (b,c) nude mouse tumors induced by (b) second cycle- and (c) third cycle-transformants
derived from transfection of NIH3T3 cells with human breast carcinoma DNA and (d) T24 human cells,
were digested with Sac I and submitted to Southern transfer analysis. Hybridization was conducted for 48 hours
under stringent conditions (50% v/v formamide, 42°C) using 5 x 10^7 cpm of [32P]-labeled nick-translated probes
corresponding to (A) a 180 bp EcoRI-Hinc II and (B) a 575 bp Sac I-Pst I DNA fragment of pSK65, a
Bluescript-derived plasmid containing a human vav proto-oncogene cDNA clone [Katzav, S. et al., supra]. Filters
were exposed to Kodak X-OMAT film at -70°C for (A) 10 days or (B) 3 days in the presence of intensifier screens.
Co-electrophoresed λ Hind III DNA fragments were used as size markers. The migration of the genomic (A) 4
kbp and (B) 7 kbp Sac I DNA fragments is indicated by arrows. The precise location of the pSK65-derived probes
is indicated in the upper diagram. The vertical arrow indicates the breakpoint caused by the genomic rearrangement
that activated the previously characterized human vav oncogene [Katzav, S. et al., supra].

The present invention concerns an isolated nucleic acid molecule comprising a nucleic acid sequence coding
for all or part of a mouse vav proto-oncogene protein. Preferably, the nucleic acid molecule is a DNA
molecule and the nucleic acid sequence is a DNA sequence. Further preferred is a DNA sequence having all
or part of the nucleotide sequence substantially as shown in Figure 2 [SEQ. ID NO: 1], or a DNA sequence
complementary to this DNA sequence. In the case of a nucleotide sequence (e.g., a DNA sequence) coding
for part of a mouse vav proto-oncogene protein, it is preferred that the nucleotide sequence be at least about
15 nucleotides in length.

The DNA sequences of the present invention can be isolated from a variety of sources, although the
presently preferred sequence has been isolated from two different mouse cDNA libraries. The exact amino acid
sequence of the polypeptide molecule produced will vary with the initial DNA sequence.

The DNA sequences of the present invention can be obtained using various methods well-known to those
of ordinary skill in the art. At least three alternative principal methods may be employed:
(1) the isolation of a double-stranded DNA sequence from genomic DNA or complementary DNA (cDNA)
which contains the sequence;
(2) the chemical synthesis of the DNA sequence; and
(3) the synthesis of the DNA sequence by polymerase chain reaction (PCR).

In the first approach, a genomic or cDNA library can be screened in order to identify a DNA sequence coding
for all or part of a mouse vav proto-oncogene protein. For example, a mouse cDNA library can be screened in
order to identify a DNA sequence coding for all or part of a mouse vav proto-oncogene protein. Various mouse
cDNA libraries, for example, those derived from WEHI-3 (ATCC TIB 68) cells and those derived from EL-4
(ATCC TIB 39) cells, can be employed. Various techniques can be used to screen the genomic DNA or cDNA
libraries.

For example, labeled single stranded DNA probe sequences duplicating a sequence present in the target
genomic DNA or cDNA coding for all or part of a mouse vav proto-oncogene protein can be employed in
DNA/DNA hybridization procedures carried out on cloned copies of the genomic DNA or cDNA which have been
denatured to single stranded form.
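
The base-pairing logic behind such a probe can be illustrated with a minimal Python sketch. It is offered only as an illustration: the function names and the 30-base target strand are hypothetical and are not taken from SEQ. ID NO: 1, and only perfect matches on a single denatured strand are considered, whereas a real hybridization tolerates some mismatch depending on stringency.

```python
# Complement table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_sites(target_strand: str, probe: str) -> list:
    """Return 0-based positions on target_strand where the probe can pair perfectly.

    A single-stranded probe anneals wherever the target strand contains
    the reverse complement of the probe.
    """
    target = target_strand.upper()
    site = reverse_complement(probe)
    hits = []
    start = target.find(site)
    while start != -1:
        hits.append(start)
        start = target.find(site, start + 1)
    return hits


# Hypothetical denatured target strand (illustrative only).
target = "ATGGAGCTGTGGCGACAGTGCACCCACTGG"
# A 15-base probe complementary to bases 7-21 of the target.
probe = reverse_complement(target[6:21])
print(probe_sites(target, probe))
```

In this sketch the probe is 15 bases long, matching the minimum length preferred above for partial sequences.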

A genomic DNA or cDNA library can also be screened for a genomic DNA or cDNA coding for all or part 
of a mouse vav proto-oncogene protein using immunoblotting techniques. 

In one typical screening method suitable for either immunoblotting or hybridization techniques, the genomic
DNA library, which is usually contained in a vector such as λgt11, or cDNA library is first spread out on agarose
plates, and then the clones are transferred to filter membranes, for example, nitrocellulose membranes. A DNA
probe can then be hybridized or an antibody can then be bound to the clones to identify those clones containing
the genomic DNA or cDNA coding for all or part of a mouse vav proto-oncogene protein.

In the second approach, the DNA sequence of the present invention coding for all or part of a mouse vav
proto-oncogene protein can be chemically synthesized. For example, the DNA sequence coding for a mouse
vav proto-oncogene protein can be synthesized as a series of 100 base oligonucleotides that can then be
sequentially ligated (via appropriate terminal restriction sites) so as to form the correct linear sequence of
nucleotides.
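
The bookkeeping for such a synthesis can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the helper names are hypothetical, the sequence is a placeholder with the same length as the 2793 bp insert of pMB24 rather than the actual sequence, and the terminal restriction sites used for ligation are ignored.

```python
def split_into_oligos(sequence: str, oligo_length: int = 100) -> list:
    """Split a sequence into consecutive oligonucleotides of at most oligo_length bases."""
    if oligo_length <= 0:
        raise ValueError("oligo_length must be positive")
    return [sequence[i:i + oligo_length] for i in range(0, len(sequence), oligo_length)]


def ligate(oligos: list) -> str:
    """Join the oligonucleotides in order to rebuild the linear sequence."""
    return "".join(oligos)


# Placeholder sequence of 2793 bases (NOT the sequence of SEQ. ID NO: 1).
placeholder = ("ACGT" * 699)[:2793]
oligos = split_into_oligos(placeholder)

# Number of oligonucleotides needed and the length of the last, shorter one.
print(len(oligos), len(oligos[-1]))
# Sequential ligation in the original order restores the full sequence.
print(ligate(oligos) == placeholder)
```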

In the third approach, the DNA sequences of the present invention coding for all or part of a mouse vav
proto-oncogene protein can be synthesized using PCR. Briefly, pairs of synthetic DNA oligonucleotides at least
15 bases in length (PCR primers) that hybridize to opposite strands of the target DNA sequence are used to
enzymatically amplify the intervening region of DNA on the target sequence. Repeated cycles of heat denaturation
of the template, annealing of the primers and extension of the 3'-termini of the annealed primers with a
DNA polymerase result in amplification of the segment defined by the 5' ends of the PCR primers. See U.S.
Patent Nos. 4,683,195 and 4,683,202.
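
The way in which a primer pair defines the amplified segment can be sketched in Python. The sketch is an idealized illustration, not a description of any particular protocol: the function names and the 52-base template are hypothetical, primers are required to match perfectly, and the copy number assumes perfect doubling in every cycle, which is an upper bound that real reactions do not reach.

```python
from typing import Optional

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon(template: str, forward_primer: str, reverse_primer: str) -> Optional[str]:
    """Return the segment bounded by the 5' ends of the two primers, or None.

    The forward primer matches the given (top) strand; the reverse primer
    anneals to the top strand, so its reverse complement is searched for.
    """
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end_site = template.find(reverse_site, start + 1)
    if end_site == -1:
        return None
    return template[start:end_site + len(reverse_site)]


def max_copies(initial_copies: int, cycles: int) -> int:
    """Idealized yield: the target segment doubles in every cycle."""
    return initial_copies * 2 ** cycles


# Hypothetical template (illustrative only, not from SEQ. ID NO: 1).
template = "GGATCC" + "ATGGAGCTGTGGCGACAGTG" + "CACCCACTGGCTGATCCAGT" + "TTAAGC"
forward = template[6:21]                        # 15-base forward primer
reverse = reverse_complement(template[31:46])   # 15-base reverse primer
product = amplicon(template, forward, reverse)
print(product, len(product))
print(max_copies(1, 30))
```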

The DNA sequences of the present invention can be used in a variety of ways in accordance with the
present invention. For example, they can be used as DNA probes to screen other cDNA and genomic DNA libraries
so as to select by hybridization other DNA sequences that code for proteins related to a mouse vav
proto-oncogene protein. In addition, the DNA sequences of the present invention coding for all or part of a mouse vav
proto-oncogene protein can be used as DNA probes to screen other cDNA and genomic DNA libraries to select
by hybridization DNA sequences that code for the vav proto-oncogene protein molecules from organisms other
than mice.

The DNA sequences of the present invention coding for all or part of a mouse vav proto-oncogene protein
can also be modified (i.e., mutated) to prepare various mutations. Such mutations may be either degenerate,
i.e., the mutation does not change the amino acid sequence encoded by the mutated codon, or non-degenerate,
i.e., the mutation changes the amino acid sequence encoded by the mutated codon. These modified DNA
sequences may be prepared, for example, by mutating a mouse vav proto-oncogene protein DNA sequence so that
the mutation results in the deletion, substitution, insertion, inversion or addition of one or more amino acids in
the encoded polypeptide using various methods known in the art. For example, the methods of site-directed
mutagenesis described in Taylor, J. W. et al., Nucl. Acids Res. 13, 8749-8764 (1985) and Kunkel, J. A., Proc.
Natl. Acad. Sci. USA 82, 482-492 (1985) may be employed. In addition, kits for site-directed mutagenesis may
be purchased from commercial vendors. For example, a kit for performing site-directed mutagenesis may be
purchased from Amersham Corp. (Arlington Heights, IL). Both degenerate and non-degenerate mutations may
be advantageous in producing or using the polypeptides of the present invention. For example, these mutations
may permit higher levels of production, easier purification, or provide additional restriction endonuclease
recognition sites. All such modified DNAs (and the encoded polypeptide molecules) are included within the scope
of the present invention.
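
The distinction between degenerate and non-degenerate mutations can be illustrated with a short Python sketch that compares the amino acids encoded by a codon before and after mutation. It assumes the standard genetic code; the function name and the example codons are hypothetical and are not drawn from Figure 2.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_mutation(original_codon: str, mutated_codon: str) -> str:
    """Classify a codon change as 'degenerate' or 'non-degenerate'.

    Degenerate: the encoded amino acid is unchanged.
    Non-degenerate: the encoded amino acid (or stop signal) changes.
    """
    before = CODON_TABLE[original_codon.upper()]
    after = CODON_TABLE[mutated_codon.upper()]
    return "degenerate" if before == after else "non-degenerate"


# CTG and CTA both encode leucine.
print(classify_mutation("CTG", "CTA"))
# TGT encodes cysteine, TCT encodes serine.
print(classify_mutation("TGT", "TCT"))
```

A degenerate change of this kind is what would allow, for example, a restriction endonuclease recognition site to be introduced without altering the encoded polypeptide.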

As used in the present application, the term "modified", when referring to a nucleotide or polypeptide
sequence, means a nucleotide or polypeptide sequence which differs from the wild-type sequence found in nature.

The present invention further concerns expression vectors comprising a DNA sequence coding for all or
part of a mouse vav proto-oncogene protein. The expression vectors preferably contain all or part of the DNA
sequence having the nucleotide sequence substantially as shown in Figure 2 [SEQ. ID NO: 1]. Further preferred
are expression vectors comprising one or more regulatory DNA sequences operatively linked to the DNA
sequence coding for all or part of a mouse vav proto-oncogene protein. As used in this context, the term "operatively
linked" means that the regulatory DNA sequences are capable of directing the replication and/or the expression
of the DNA sequence coding for all or part of a mouse vav proto-oncogene protein.

Expression vectors of utility in the present invention are often in the form of "plasmids", which refer to
circular double stranded DNAs which, in their vector form, are not bound to the chromosome. However, the
invention is intended to include such other forms of expression vectors which serve equivalent functions and which
become known in the art subsequently hereto.

Expression vectors useful in the present invention typically contain an origin of replication, a promoter
located in front of (i.e., upstream of) the DNA sequence and followed by the DNA sequence coding for all or
part of a mouse vav proto-oncogene protein, transcription termination sequences and the remaining vector.
The expression vectors may also include other DNA sequences known in the art, for example, stability leader
sequences which provide for stability of the expression product, secretory leader sequences which provide for
secretion of the expression product, sequences which allow expression of the structural gene to be modulated
(e.g., by the presence or absence of nutrients or other inducers in the growth medium), marking sequences
which are capable of providing phenotypic selection in transformed host cells, and sequences which provide
sites for cleavage by restriction endonucleases. The characteristics of the actual expression vector used must
be compatible with the host cell which is to be employed. For example, when cloning in a mammalian cell system,
the expression vector should contain promoters isolated from the genome of mammalian cells (e.g., mouse
metallothionein promoter), or from viruses that grow in these cells (e.g., vaccinia virus 7.5 K promoter). An
expression vector as contemplated by the present invention is at least capable of directing the replication, and
preferably the expression, of the DNA sequences of the present invention. Suitable origins of replication include,
for example, the Ori origin of replication from the ColE1 derivative of pMB1. Suitable promoters include, for
example, the long terminal repeats of the Moloney sarcoma virus, the Rous sarcoma virus and the mouse
mammary tumor virus, as well as the early regions of Simian virus 40 and the polyoma virus. As selectable markers,
the bacterial genes encoding resistance to the antibiotics neomycin and G418 (neo), puromycin (pur) or
hygromycin (hygro), or mammalian genes encoding thymidine kinase can be employed. All of these materials are
known in the art and are commercially available.

Particularly preferred is the expression vector designated pMB24, described herein below, which contains 
the DNA sequence coding for a mouse vav proto-oncogene protein, or expression vectors with the identifying 
characteristics of pMB24. 

E. coli host cells (strain XL1-Blue) containing the plasmid pMB24 were deposited with the American Type
Culture Collection, Rockville, Maryland on January 23, 1991 under the Budapest Treaty and assigned ATCC
accession no. 68516. pMB24 contains a cDNA clone of the mouse vav proto-oncogene encompassing the
entire coding sequence.

Suitable expression vectors containing the desired coding and control sequences may be constructed
using standard recombinant DNA techniques known in the art, many of which are described in Maniatis, T. et
al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1982).

The present invention additionally concerns host cells containing an expression vector which comprises a
DNA sequence coding for all or part of a mouse vav proto-oncogene protein. The host cells preferably contain
an expression vector which comprises all or part of the DNA sequence having the nucleotide sequence
substantially as shown in Figure 2 [SEQ. ID NO: 1]. Further preferred are host cells containing an expression vector
comprising one or more regulatory DNA sequences capable of directing the replication and/or the expression
of and operatively linked to a DNA sequence coding for all or part of a mouse vav proto-oncogene protein.
Suitable host cells include both prokaryotic and eukaryotic cells. Suitable prokaryotic host cells include, for
example, various strains of E. coli such as DH5, C600 and LL1. Suitable eukaryotic host cells include, for
example, mouse NIH3T3 and BALB3T3 cells, rat Rat-2 cells, monkey COS cells, human HeLa cells and hamster
CHO cells.

Preferred as host cells are mouse NIH3T3 cells. 

Expression vectors may be introduced into host cells by various methods known in the art. For example,
transfection of host cells with expression vectors can be carried out by the calcium phosphate precipitation
method. However, other methods for introducing expression vectors into host cells, for example,
electroporation, biolistic fusion, liposomal fusion, nuclear injection and viral or phage infection can also be employed.

Once an expression vector has been introduced into an appropriate host cell, the host cell can be cultured
under conditions permitting expression of large amounts of the desired polypeptide, in this case a polypeptide
molecule comprising all or part of a mouse vav proto-oncogene protein. Such polypeptides are useful in the
study of the characteristics of a mouse vav proto-oncogene protein, for example, its role in oncogenesis. Such
polypeptides can also be used to identify potential anti-cancer drugs. For example, a compound which is able
to bind to or inhibit the function of the vav proto-oncogene may be an effective cancer chemotherapeutic agent.

Host cells containing an expression vector which contains a DNA sequence coding for all or part of a mouse
vav proto-oncogene protein may be identified by one or more of the following four general approaches: (a)
DNA-DNA hybridization; (b) the presence or absence of marker gene functions; (c) assessing the level of transcription
as measured by the production of mouse vav proto-oncogene protein mRNA transcripts in the host cell; and
(d) detection of the gene product immunologically.

In the first approach, the presence of a DNA sequence coding for all or part of a mouse vav proto-oncogene
protein can be detected by DNA-DNA or RNA-DNA hybridization using probes complementary to the DNA
sequence.

In the second approach, the recombinant expression vector host system can be identified and selected
based upon the presence or absence of certain marker gene functions (e.g., thymidine kinase activity, resistance
to antibiotics, etc.). A marker gene can be placed in the same plasmid as the DNA sequence coding for all or
part of a mouse vav proto-oncogene protein under the regulation of the same or a different promoter used to
regulate a mouse vav proto-oncogene protein coding sequence. Expression of the marker gene in response
to induction or selection indicates expression of the DNA sequence coding for all or part of a mouse vav
proto-oncogene protein.

In the third approach, the production of mouse vav proto-oncogene protein mRNA transcripts can be
assessed by hybridization assays. For example, polyadenylated RNA can be isolated and analyzed by Northern
blotting or nuclease protection assay using a probe complementary to the RNA sequence. Alternatively, the total
nucleic acids of the host cell may be extracted and assayed for hybridization to such probes.

In the fourth approach, the expression of all or part of a mouse vav proto-oncogene protein can be assessed 
immunologically, for example, by Western blotting. 

The DNA sequences of expression vectors, plasmids or DNA molecules of the present invention may be
determined by various methods known in the art. For example, the dideoxy chain termination method as
described in Sanger et al., Proc. Natl. Acad. Sci. USA 74, 5463-5467 (1977), or the Maxam-Gilbert method as
described in Proc. Natl. Acad. Sci. USA 74, 560-564 (1977) may be employed.

It should, of course, be understood that not all expression vectors and DNA regulatory sequences will
function equally well to express the DNA sequences of the present invention. Neither will all host cells function
equally well with the same expression system. However, one of ordinary skill in the art may make a selection among
expression vectors, DNA regulatory sequences, and host cells using the guidance provided herein without
undue experimentation and without departing from the scope of the present invention.

The present invention further concerns a method for detecting a nucleic acid sequence coding for all or
part of a mouse vav proto-oncogene protein or a related nucleic acid sequence comprising contacting the
nucleic acid sequence with a detectable marker which binds specifically to at least a portion of the nucleic acid
sequence, and detecting the marker so bound. The presence of bound marker indicates the presence of the
nucleic acid sequence. Preferably, the nucleic acid sequence is a DNA sequence having all or part of the
nucleotide sequence substantially as shown in Figure 2 [SEQ. ID NO: 1]. Also preferred is a method in which the
DNA sequence is a genomic DNA sequence. A DNA sample containing the DNA sequence may be isolated
using various methods for DNA isolation which are well-known to those of ordinary skill in the art. For example,
a genomic DNA sample may be isolated from tissue by rapidly freezing the tissue from which the DNA is to be
isolated, crushing the tissue to produce readily digestible pieces, placing the crushed tissue in a solution of
proteinase K and sodium dodecyl sulfate, and incubating the resulting solution until most of the cellular protein
is degraded. The digest is then deproteinized by successive phenol/chloroform/isoamyl alcohol extractions,
and the DNA is recovered by ethanol precipitation, dried and resuspended in buffer.

Also preferred is the method in which the nucleic acid sequence is an RNA sequence. Preferably, the RNA
sequence is an mRNA sequence. Additionally preferred is the method in which the RNA sequence is located
in the cells of a tissue sample. An RNA sample containing the RNA sequence may be isolated using various
methods for RNA isolation which are well-known to those of ordinary skill in the art. For example, an RNA
sample may be isolated from cultured cells by washing the cells free of media and then lysing the cells by placing
them in a 4 M guanidinium solution. The viscosity of the resulting solution is reduced by drawing the lysate
through a 20 gauge needle. The RNA is then pelleted through a CsCl step gradient, and the supernatant fluid
from the gradient carefully removed to allow complete separation of the RNA, found in the pellet, from
contaminating DNA and protein.

The detectable marker useful for detecting a nucleic acid sequence coding for all or part of a mouse vav
proto-oncogene protein or a related nucleic acid sequence may be a labeled DNA sequence, including a
labeled cDNA sequence, having a nucleotide sequence complementary to at least a portion of the DNA
sequence coding for all or part of a mouse vav proto-oncogene protein.

The detectable marker may also be a labeled sense or antisense RNA sequence having a nucleotide
sequence complementary to at least a portion of the DNA sequence coding for all or part of a mouse vav
proto-oncogene protein.

The detectable markers of the present invention may be labeled with commonly employed radioactive
labels, such as 32P and 35S, although other labels such as biotin or mercury may be employed. Various methods
well-known to those of ordinary skill in the art may be used to label the detectable markers. For example, DNA
sequences and RNA sequences may be labeled with 32P or 35S using the random primer method.

Once a suitable detectable marker has been obtained, various methods well-known to those of ordinary
skill in the art may be employed for contacting the detectable marker with the sample of interest. For example,
DNA-DNA, RNA-RNA and DNA-RNA hybridizations may be performed using standard procedures known in
the art. In a typical DNA-DNA hybridization procedure for detecting DNA sequences coding for all or part of a
mouse vav proto-oncogene protein in genomic DNA, the genomic DNA is first isolated using known methods,
and then digested with one or more restriction enzymes. The resulting DNA fragments are separated on
agarose gels and denatured in situ. After prehybridization to reduce nonspecific hybridization, a radiolabeled nucleic
acid probe is hybridized to the immobilized DNA fragments. The filter is then washed to remove unbound or
weakly bound probe, and is then autoradiographed to identify the DNA fragments that have hybridized with
the probe.
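
The digestion and probing steps of this procedure can be mimicked in a minimal Python sketch. It is a simplified illustration under stated assumptions: the function names and the 53-base "genomic" sequence are hypothetical, EcoRI (recognition site GAATTC, cutting after the first G) is chosen only as an example enzyme, the DNA is treated as linear, and a fragment is scored as hybridizing only if it contains the probed sequence in full.

```python
def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list:
    """Cut linear DNA at every occurrence of a restriction site.

    cut_offset is the number of bases of the site left on the upstream
    fragment (1 for EcoRI, which cuts G^AATTC).
    """
    dna = dna.upper()
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        fragments.append(dna[start:pos + cut_offset])
        start = pos + cut_offset
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments


def hybridizing_fragment_sizes(fragments: list, probed_sequence: str) -> list:
    """Return sizes of fragments containing the probed sequence, largest first.

    Sorting by size mirrors the separation of fragments on an agarose gel.
    """
    ordered = sorted(fragments, key=len, reverse=True)
    return [len(fragment) for fragment in ordered if probed_sequence in fragment]


# Hypothetical genomic sequence with two EcoRI sites (illustrative only).
probed = "TGGCGACAGTGCACC"
genomic = "A" * 10 + "GAATTC" + "CCCC" + probed + "CCCC" + "GAATTC" + "T" * 8
fragments = digest(genomic)
print([len(fragment) for fragment in fragments])
print(hybridizing_fragment_sizes(fragments, probed))
```

Only the fragment carrying the probed sequence would give a band on the autoradiograph in this idealized picture.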

The presence of bound detectable marker may be detected using various methods well-known to those of
ordinary skill in the art. For example, if the detectable marker is radioactively labeled, autoradiography may be
employed. Depending on the label employed, other detection methods such as spectrophotometry may also
be used.

It should be understood that nucleic acid sequences related to nucleic acid sequences coding for all or part
of a mouse vav proto-oncogene protein can also be detected using the methods described herein. For example, a DNA probe
based on conserved regions of a mouse vav proto-oncogene protein (e.g., the helix-loop region, leucine zipper
domain and cysteine-rich [zinc-finger] domain) can be used to detect and isolate related DNA sequences (e.g.,
a DNA sequence coding for a rat vav proto-oncogene protein). All such methods are included within the scope
of the present invention.

As used in the present application and in this context, the term "related" means a nucleic acid sequence
which is able to hybridize to an oligonucleotide probe based on the nucleotide sequence of a mouse vav
proto-oncogene protein.

The present invention further concerns polypeptide molecules comprising all or part of a mouse vav
proto-oncogene protein, said polypeptide molecules preferably having all or part of the amino acid sequence
substantially as shown in Figure 2 [SEQ. ID NO: 2].

The polypeptides of the present invention may be obtained by synthetic means, i.e., chemical synthesis
of the polypeptide from its component amino acids, by methods known to those of ordinary skill in the art. For
example, the solid phase procedure described by Houghton et al., Proc. Natl. Acad. Sci. 82, 5135 (1985) may
be employed. It is preferred that the polypeptides be obtained by production in prokaryotic or eukaryotic host
cells expressing a DNA sequence coding for all or part of a mouse vav proto-oncogene protein, or by in vitro
translation of the mRNA encoded by a DNA sequence coding for all or part of a mouse vav proto-oncogene
protein. For example, the DNA sequence of Figure 2 [SEQ. ID NO: 1] may be synthesized using PCR as
described above and inserted into a suitable expression vector, which in turn may be used to transform a suitable
host cell. The recombinant host cell may then be cultured to produce a mouse vav proto-oncogene protein.
Techniques for the production of polypeptides by these means are known in the art, and are described herein.

The polypeptides produced in this manner may then be isolated and purified to some degree using various 
protein purification techniques. For example, chromatographic procedures such as ion exchange 
chromatography, gel filtration chromatography and immunoaffinity chromatography may be employed. 

The polypeptides of the present invention may be used in a wide variety of ways. For example, the
polypeptides may be used to prepare in a known manner polyclonal or monoclonal antibodies capable of binding the
polypeptides. These antibodies may in turn be used for the detection of the polypeptides of the present invention
in a sample, for example, a cell sample, using immunoassay techniques, for example, radioimmunoassay or
enzyme immunoassay. The antibodies may also be used in affinity chromatography for purifying the
polypeptides of the present invention and isolating them from various sources.

The polypeptides of the present invention have been defined by means of determined DNA and deduced
amino acid sequencing. Due to the degeneracy of the genetic code, other DNA sequences which encode the
same amino acid sequence as depicted in Figure 2 [SEQ. ID NO: 2] may be used for the production of the
polypeptides of the present invention. In addition, it will be understood that allelic variations of these DNA and
amino acid sequences naturally exist, or may be intentionally introduced using methods known in the art. These
variations may be demonstrated by one or more amino acid differences in the overall sequence, or by deletions,
substitutions, insertions, inversions or additions of one or more amino acids in said sequence. Such amino acid
substitutions may be made, for example, on the basis of similarity in polarity, charge, solubility, hydrophobicity,
hydrophilicity and/or the amphipathic nature of the residues involved. For example, negatively charged amino
acids include aspartic acid and glutamic acid; positively charged amino acids include lysine and arginine; amino
acids with uncharged polar head groups or nonpolar head groups having similar hydrophilicity values include
the following: leucine, isoleucine, valine; glycine, alanine; asparagine, glutamine; serine, threonine;
phenylalanine, tyrosine. Other contemplated variations include salts and esters of the aforementioned
polypeptides, as well as precursors of the aforementioned polypeptides, for example, precursors having N-terminal
substituents such as methionine, N-formylmethionine and leader sequences. All such variations are included within
the scope of the present invention.
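
By way of illustration only, the degeneracy of the genetic code referred to above can be made concrete with a short
script. The following Python sketch is not part of the described methods. It builds the standard genetic code, shows
that two different DNA sequences can encode the same amino acids, and counts how many distinct DNA sequences
could encode a given peptide. The function names translate and count_encodings are hypothetical names chosen
for this sketch; the 14-residue input is the synthetic peptide sequence given in Example I.

```python
from collections import Counter
from math import prod

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(dna):
    """Translate a DNA string codon by codon (one-letter amino acid code)."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_encodings(peptide):
    """Number of distinct DNA sequences that encode the given peptide."""
    codons_per_residue = Counter(CODON_TABLE.values())
    return prod(codons_per_residue[aa] for aa in peptide)


# Two different DNA sequences, same encoded residues (Met-Glu-Leu-Trp).
print(translate("ATGGAGCTCTGG"))
print(translate("ATGGAACTGTGG"))

# Synthetic 14-mer peptide from Example I.
print(count_encodings("KDKLHRRAQDKKRN"))
```

Even for a 14-residue peptide the count runs into the millions, which is why a claim limited to a single nucleotide
sequence would not cover every DNA capable of producing the same polypeptide.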

The following examples are further illustrative of the present invention. These examples are not intended 
to limit the scope of the present invention, and provide further understanding of the invention. 

EXAMPLE I 

ISOLATION AND CHARACTERIZATION OF MOUSE VAV PROTO-ONCOGENE 
A. MATERIALS AND METHODS
1. Gene Transfer Assay 

NIH3T3 mouse cells were transfected with various amounts (1 ng to 1 µg) of linearized plasmid DNA in the
presence of 20 ng of carrier (calf thymus) DNA as described in Graham, F.L. and van der Eb, A.J., Virology
52, 456-467 (1975). Foci of transformed cells were scored after 14 days. To isolate G418-resistant colonies,
NIH3T3 cells were co-transfected with 20 ng of pSVneo DNA and 1 µg of the desired plasmid DNA as described
in Fasano, O. et al., Mol. Cell Biol. 4, 1695-1705 (1984).

2. Mouse vav cDNA clones

cDNA libraries derived from WEHI-3 and EL-4 hematopoietic cell lines (Stratagene, La Jolla, CA) were
screened under partially relaxed hybridization conditions (42°C in 5 X SSC [SSC = 35.06 g/l NaCl, 17.65 g/l
Na-citrate, pH 7.0], 40% formamide, 1 X Denhardt's solution) using as a probe a 32P-labeled insert of pSK8
(ATCC 41060), a plasmid containing a partial cDNA clone of the human vav proto-oncogene [Katzav, S. et al.,
supra]. Recombinant phages carrying the longest inserts (2.8 kbp) were subcloned in
Bluescript KS (Stratagene) to generate pMB24 and pMB25. These mouse vav cDNA clones were submitted
to nucleotide sequence analysis by the dideoxy chain termination method [Sanger, F. et al., Proc. Natl. Acad.
Sci. USA 74, 5463-5467 (1977)] using double-stranded DNA, synthetic oligonucleotides as primers and
modified T7 DNA polymerase (Sequenase, United States Biochemicals, Cleveland, OH).

3. Expression plasmids 

Mouse vav expression plasmids. pJC11 was generated by subcloning the entire 2.8 kbp cDNA insert of
pMB24 into the EcoRI site of pMEX, a mammalian expression vector carrying a multiple cloning site (MCS) flanked
by an MSV LTR (Moloney sarcoma virus long terminal repeat) and an SV40 polyadenylation signal [Martin-Zanca,
D. et al., Mol. Cell Biol. 9, 24-33 (1989)]. Subcloning procedures involved digestion of pMB24 DNA with the
restriction endonuclease Eco RI, purification of the 2.8 kbp cDNA insert and religation to Eco RI-digested pMEX
DNA. These procedures are standard recombinant DNA techniques and are described in detail in Maniatis, T.
et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1982).
The 2.8 kbp EcoRI DNA insert of pMB24 was isolated after partial digestion to avoid cleavage at the internal
EcoRI site (nucleotides 2251-2256, Figure 2) [see SEQ. ID NO: 1]. pJC12 was obtained by deleting an internal
280 bp DNA fragment encompassed between the Sal I cleavage site present in the MCS and the unique Nru
I site located at position 184-189 (Figure 2) [see SEQ. ID NO: 1]. This Nru I site lies just upstream of a second
ATG codon (nucleotides 209-211, Figure 2) [see SEQ. ID NO: 1] that serves as a translational initiator in this
plasmid. pJC17 was generated by replacing the internal 607 bp Kpn I-Stu I DNA fragment (nucleotides 992-1599
in Figure 2) [see SEQ. ID NO: 1] of pJC12 by a mutant DNA fragment carrying a single point mutation
(T→A) at position 1595 (Figure 2) [SEQ. ID NO: 1]. The mutated fragment was obtained by PCR-aided
amplification of the 607 bp Kpn I-Stu I DNA fragment using a mismatched 3' amplimer. pJC18 was generated by
replacing the internal 186 bp Eco RV-Bam HI DNA fragment (nucleotides 1638-1824 in Figure 2) [see SEQ. ID
NO: 1] of pJC12 with a mutant DNA fragment carrying a single point mutation (G→C) at position 1738 (Figure
2) [see SEQ. ID NO: 1]. The mutated fragment was obtained by PCR-aided amplification of the 186 bp Eco
RV-Bam HI DNA fragment using a mismatched 5' amplimer. pJC19 was generated by replacing the internal
72 bp Eco RV-Nco I DNA fragment (nucleotides 1638-1800 in Figure 2) [see SEQ. ID NO: 1] of pJC12 by a
mutant DNA fragment carrying a single point mutation (C→G) at position 1670 (Figure 2) [see SEQ. ID NO:
1]. The mutated DNA fragment was obtained by chemical synthesis.
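
The coordinates quoted for the construction of pJC12 can be checked against the printed sequence. The following
Python sketch is offered only as an illustration and is not part of the described procedures. It takes nucleotides 1
to 241 as printed in the sequence listing for SEQ ID NO: 1, locates the Nru I recognition sequence (TCGCGA), and
lists the ATG codons that are in frame with the first ATG at nucleotide 14. The variable names are hypothetical.

```python
# Nucleotides 1-241 of SEQ ID NO: 1, transcribed from the sequence listing.
SEQ_1_241 = "".join("""
GCCGGCAGCC ACC ATG GAG CTC TGG CGA CAG TGC ACC CAC TGG CTG ATC
CAG TGT CGG GTG CTG CCT CCC AGC CAC CGT GTG ACC TGG GAG GGG GCC
CAG GTG TGT GAG CTG GCA CAG GCA CTG CGG GAC GGT GTC CTC TTG TGC
CAA TTG CTT AAC AAC CTG CTT CCC CAG GCC ATT AAT CTT CGC GAG GTT
AAC TTG CGG CCC CAG ATG TCC CAG TTC CTT TGT CTT AAG AAC ATT CGA
""".split())

NRU_I_SITE = "TCGCGA"   # Nru I recognition sequence
ORF_START = 14          # 1-based position of the first ATG

# 1-based start and end of the first Nru I site.
nru_start = SEQ_1_241.find(NRU_I_SITE) + 1
nru_site = (nru_start, nru_start + len(NRU_I_SITE) - 1)

# 1-based positions of ATG codons in the reading frame that opens at nucleotide 14.
in_frame_atgs = [
    i + 1
    for i in range(ORF_START - 1, len(SEQ_1_241) - 2, 3)
    if SEQ_1_241[i:i + 3] == "ATG"
]

# Number of codons lying upstream of the second in-frame ATG.
residues_upstream = (in_frame_atgs[1] - in_frame_atgs[0]) // 3

print(len(SEQ_1_241), nru_site, in_frame_atgs, residues_upstream)
```

On this stretch of sequence the Nru I site falls at nucleotides 184-189 and the next in-frame ATG at nucleotides
209-211, so a product initiated at that codon would lack the first 65 residues of the full-length open reading frame.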

Human vav gene expression plasmids. pJC7 was obtained by inserting the 2.9 kbp EcoRI cDNA clone of
pSK65 [Katzav, S. et al., supra] into the unique EcoRI site of pMEX. pJC13 was obtained by replacing the
internal 850 bp Pst I DNA fragment of pJC7 by a similar DNA fragment generated by PCR-aided amplification using
a 5' amplimer

(5' CCGGCTGCAGGCCACCATGGAGCTGTGGCGCCAATGCACC 3')

that carried an insertion of four nucleotides (ATGG). The inserted bases reconstitute the coding sequences
presumably missing in pJC7. pJC15 was obtained by replacing the internal 552 bp Bal I fragment of pJC7 by
a similar PCR-generated DNA fragment carrying a single point mutation (T→C) in the triplet coding for the first
cysteine residue of the first zinc-finger like structure (Table 2). To obtain the mutated 552 bp Bal I fragment,
an 87 bp Bal I-Stu I fragment was amplified by PCR using a 3' amplimer that carried the mismatch needed to
introduce the required T→C mutation. This PCR-generated Bal I-Stu I fragment was then ligated to the wild type
465 bp Stu I-Bal I DNA fragment obtained from pJC7. The nucleotide sequence of each of the above expression
plasmids was verified by direct sequencing of double stranded DNA. Moreover, these expression plasmids
directed the synthesis of the expected vav protein as determined by immunoprecipitation analysis of
G418-resistant NIH3T3 cells generated by co-transfection of these plasmids with the selectable marker pSV2neo.

4. Southern and Northern blot analysis 

High molecular weight DNA was digested to completion with appropriate restriction endonucleases,
electrophoresed in 0.7% agarose gels and submitted to Southern transfer analysis as described in Southern,
E.M., J. Mol. Biol. 98, 503-517 (1975). Total cellular RNA was extracted by the guanidinium thiocyanate method
[Chirgwin, J.M. et al., Biochemistry 18, 5294-5299 (1979)] and purified by centrifugation through cesium
chloride. Poly(A)-containing RNA was isolated by retention on oligo(dT) columns (Collaborative Research,
Bedford, MA). Total RNA (10 µg) or poly(A)-selected RNA (3 ng) were submitted to Northern transfer analysis as
described in Lehrach, H. et al., Biochemistry 16, 4743-4751 (1977). The nitrocellulose filters were hybridized
with various 32P-labeled nick translated probes for 48 hours under stringent conditions (42°C in 5 X SSC, 50%
formamide, 1 X Denhardt's solution).

5. Protein analysis 

Transfection of NIH3T3 cells, isolation of transformed cells, selection of G418-resistant colonies, metabolic
labeling of cells with [35S]-methionine, immunoprecipitation with various antisera and SDS-PAGE analysis
were carried out as described in Martin-Zanca, D. et al., Mol. Cell Biol. 9, 24-33 (1989). The rabbit antiserum
used to immunoprecipitate the vav proteins was raised against a synthetic 14-mer peptide
(KDKLHRRAQDKKRN) whose sequence corresponds to either amino acid residues 576 to 589 of a mouse vav
protein (Figure 1) or to residues 528 to 541 of the human vav oncogene product [Katzav, S. et al., supra].

B. RESULTS 

1. Nucleotide sequence of the mouse vav proto-oncogene

Independent mouse cDNA libraries derived from two hematopoietic cell lines (WEHI-3 and EL-4) were used
to isolate cDNA clones of the mouse vav proto-oncogene. WEHI-3 (ATCC TIB 68) is a myeloid cell line and
EL-4 (ATCC TIB 39) cells were established from a mouse T-cell lymphoma. A total of 12 cDNA clones were
isolated. Those recombinant phages containing the longest inserts from each library (2792 bp from the
WEHI-3 and 2788 bp from the EL-4 cDNA library) were excised by using a helper phage, circularized and
propagated in E. coli DH5 cells as plasmids. These plasmids, designated pMB24 (WEHI-3 library) and pMB25 (EL-4
library), were subsequently submitted to nucleotide sequence analysis using standard dideoxy sequencing
techniques as described in Sanger et al., supra.

Figure 2 [SEQ. ID NO: 1] depicts the nucleotide sequence of the 2,793 bp long insert of pMB24. pMB25,
the cDNA clone derived from the EL-4 T-cell cDNA library, possessed an identical sequence extending from
nucleotide 5 to 2792. These results indicate that these cDNA clones are faithful representatives of normal vav
transcripts in mouse hematopoietic cells. Analysis of the nucleotide sequence of pMB24 revealed a long open
reading frame extending from nucleotides 14 to 2597. The first in-frame ATG codon (nucleotides 14-16) is part
of the canonical GCCACCATGG motif characteristic of efficient mammalian translational initiators [Kozak, M.,
Nucleic Acids Res. 15, 8125-8148 (1987)]. Analysis of mouse vav cDNA clones carrying additional 5' sequences
revealed an in-frame terminator codon (TGA) 45 nucleotides upstream of the beginning of the pMB24
clone (Figure 2) [see SEQ. ID NO: 1]. Therefore, it is likely that vav protein synthesis initiates at this ATG codon.
If so, a mouse vav proto-oncogene would code for an 844 amino acid-long polypeptide with a predicted molecular
mass of 97,303 daltons. This open reading frame is followed by a stretch of 195 bp of 3' non-coding sequences
which includes a translational terminator TGA (nucleotides 2598-2600) and the consensus
polyadenylation signal AATAAA (positions 2774 to 2779) (Figure 2) [see SEQ. ID NO: 1]. Analysis of additional
mouse vav cDNA clones carrying additional 3' sequences revealed the presence of a polyA tail just two
nucleotides downstream from the end of clone pMB24.

The predicted amino acid sequence of the putative 844 amino acid-long mouse vav protein revealed a
leucine-rich domain extending from amino acid residues 33 to 102 (Figure 2) [see SEQ. ID NO: 2]. This domain
includes a short sequence, Ala-Leu-Arg-Asp-X-Val, which is also present in each of the three members of the
myc oncogene family. This conserved motif is located within an amphipathic helix-loop-helix domain, which in
myc proteins is required for dimerization and DNA binding [Murre, C. et al., Cell 56, 777-783 (1989)]. This
sequence, however, is not shared by other DNA binding proteins such as MyoD1, daughterless and one of the
members of the achaete-scute complex that exhibit similar helix-loop-helix motifs [Murre, C. et al., Cell 58, 537-544
(1989)]. The amino terminal leucine-rich domain of the vav proto-oncogene has additional structural homologies
with the members of the myc gene family. They include a heptad repeat of hydrophobic residues, of which three
(four in the myc proteins) are leucines. This leucine zipper-like domain is separated from the shared
Ala-Leu-Arg-Asp-X-Val sequence by a putative hinge region that contains two proline residues. A similar combination
of helix-loop-helix structure followed by a heptad repeat of hydrophobic sequences has been shown to be
involved in ligand binding and dimerization of nuclear receptors [Fawell, S.E. et al., Cell 60, 953-962 (1990)].

Other relevant features identified in the deduced amino acid sequence of a mouse vav proto-oncogene
product include: (i) a highly acidic 45 amino acid-long domain (residues 132-176) in which 22 residues (49%)
are either glutamic acid or aspartic acid; (ii) two stretches of proline residues (positions 336 to 340 and 606 to 609)
that may represent hinge regions; (iii) a putative protein kinase A phosphorylation site (residues 435 to 440);
(iv) two putative nuclear localization signals (residues 486 to 493 and 575 to 582); (v) a cysteine-rich sequence
which includes two metal binding motifs, Cys-X2-Cys-X13-Cys-X2-Cys (residues 528 to 548) and
His-X2-Cys-X6-Cys-X2-His (residues 553 to 566). The former is similar to the zinc finger motif found in transcriptional
activators such as the adenovirus E1A, yeast GAL4 and certain steroid receptors [Johnson et al., Annu. Rev. Biochem.
58, 799-839 (1989)]. The overall alignment of cysteine residues in this domain
(Cys-X2-Cys-X13-Cys-X2-Cys-X7-Cys-X6-Cys) is also reminiscent of the tandem motifs found in the amino terminal domain
of the various members of the protein kinase C family and in a diacylglycerol kinase [Coussens et al., Science 233,
859-866 (1986) and Sakane, F. et al., Nature 344, 345-348 (1990)].
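
The spacing patterns of the two metal binding motifs lend themselves to a simple pattern search. The following
Python sketch is given for illustration only and uses the standard re module; it is not the method by which the
motifs were identified. The function name find_motifs and the first_residue argument are hypothetical, and the
input is a made-up toy sequence, not the vav protein sequence, arranged so that the motifs fall at the residue
numbers quoted above.

```python
import re

# "X" in the motif notation means any residue; "." plays that role in the regex.
MOTIFS = {
    "Cys-X2-Cys-X13-Cys-X2-Cys": re.compile(r"C.{2}C.{13}C.{2}C"),
    "His-X2-Cys-X6-Cys-X2-His": re.compile(r"H.{2}C.{6}C.{2}H"),
}


def find_motifs(protein, first_residue=1):
    """Return (motif name, first residue, last residue) for every match.

    first_residue is the residue number of protein[0], so that a fragment
    can be reported in the numbering of the full-length protein.
    """
    hits = []
    for name, pattern in MOTIFS.items():
        for match in pattern.finditer(protein):
            start = match.start() + first_residue
            end = match.end() + first_residue - 1
            hits.append((name, start, end))
    return hits


# Toy fragment (hypothetical): 5 filler residues, a 21-residue Cys motif,
# a 4-residue spacer, a 14-residue His/Cys motif, then 3 filler residues.
toy = ("AAAAA"
       + "C" + "GG" + "C" + "G" * 13 + "C" + "GG" + "C"
       + "SSSS"
       + "H" + "GG" + "C" + "G" * 6 + "C" + "GG" + "H"
       + "AAA")

for hit in find_motifs(toy, first_residue=523):
    print(hit)
```

The first pattern spans 21 residues and the second 14, which agrees with the residue ranges 528 to 548 and 553 to
566 and leaves a four-residue spacer between the two motifs.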

2. Homology with the human vav oncogene 

Alignment of the deduced amino acid sequences of the mouse and human vav gene products reveals a
remarkable degree of homology. The predicted mouse vav proto-oncogene sequence (amino acid residues 3
to 844) is 91.2% identical (769 residues) to that of its human counterpart. Of the 73 different residues, at least
30 are conservative substitutions, thus yielding an overall homology of 94.8% between human and murine vav
proteins. More importantly, all of the other relevant domains previously identified in the product of the human
vav gene, including the acidic domain, the two proline hinge regions, the putative protein kinase A
phosphorylation site, the cysteine-rich sequence that can fold into zinc finger-like structures and the putative nuclear
localization signals, are also present in a mouse vav gene product (Figure 2) [see SEQ. ID NO: 1]. The mouse vav
protein is one amino acid shorter (844 residues) due to the presence of a single Ile residue instead of the
sequence Thr717 Val718 found in its human counterpart.
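
The two percentages quoted above follow from simple counts over the alignment. The short Python sketch below
reproduces them for illustration. It rests on one assumption that is not stated in the text: that the alignment
spans 843 columns, namely the 769 identical positions, the 73 differing positions, and one gap column arising
from the Ile versus Thr-Val difference. The variable names are hypothetical.

```python
# Counts taken from the text.
identical = 769
different = 73
conservative = 30      # "at least 30" conservative substitutions

# Assumption for this sketch: one extra alignment column for the
# single-residue length difference between the two proteins.
gap_columns = 1

columns = identical + different + gap_columns

identity_pct = 100 * identical / columns
homology_pct = 100 * (identical + conservative) / columns

print(f"alignment columns: {columns}")
print(f"identity: {identity_pct:.1f}%")
print(f"identity plus conservative substitutions: {homology_pct:.1f}%")
```

Because the count of conservative substitutions is given as a lower bound, the second figure should likewise be
read as a lower bound.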

Comparison of a mouse vav proto-oncogene product with that of the human vav oncogene suggests that
its 67 amino terminal amino acids were replaced by 19 unrelated residues derived from the bacterial Tn5 gene.
Therefore, the human vav oncogene retains the carboxy-terminal moiety of the leucine-rich domain which
includes the leucine repeat, but not the Ala-Leu-Arg-Asp-X-Val sequences shared with each of the members
of the myc gene family.

3. Expression of the mouse vav proto-oncogene 



It has been previously shown that the human vav proto-oncogene is specifically expressed in cells of
hematopoietic origin regardless of their differentiation lineage [Katzav, S. et al., supra]. Analysis of mouse
cell lines confirms this pattern of expression. As summarized in Table 1, vav gene transcripts were identified in
hematopoietic cells of myeloid (macrophage-derived 7.1.3 cell line), lymphoid (MOPC 315 plasmacytoma and A20
B-lymphocyte cell lines) and erythroid (Friend erythroleukemia cells, F412B2 clone) origin. The levels of vav gene
expression in undifferentiated mouse F412B2 cells were comparable to those present in the differentiated
erythroid-like cells obtained by treatment of F412B2 cells with DMSO or HMBA. Similar results were obtained when
human HEL and HL60 cells were induced to differentiate along different hematopoietic lineages [Katzav, S. et al.,
supra].

Northern blot analysis of RNA isolated from mouse fibroblastic cell lines failed to reveal detectable levels
of vav gene expression (Table 1). These results were independent of the proliferative state of the cells since
neither quiescent nor serum-stimulated BALB3T3 cells possessed detectable vav gene transcripts. Similarly, vav
gene expression was not found to correlate with the tumorigenic state of the cell since neither non-tumorigenic
NIH3T3 cells nor tumorigenic NIH3T3-derived ψ2 cells expressed detectable vav gene sequences (Table 1).

To determine the pattern of expression of the vav proto-oncogene in vivo, RNAs were isolated from various
mouse tissues and submitted to Northern blot analysis. vav gene transcripts were observed in spleen and lung
tissues but not in brain, heart, intestine, muscle, ovaries or testes (Figure 3). Expression of the vav gene in
spleen cells indicates that this locus is expressed in hematopoietic cells in vivo. The presence of vav gene
transcripts in lung raises the possibility that this gene may also be expressed in non-hematopoietic cell types.
However, lungs are known to contain high levels of infiltrating macrophages that may account for the results depicted
in Figure 3.

4. Identification of the mouse vav proto-oncogene product 

To identify the product of a mouse vav proto-oncogene, rabbits were immunized with a peptide whose
sequence corresponded to that of an amphiphilic region conserved in the mouse and human vav gene proteins
(amino acid residues 576 to 589 of Figure 2) [see SEQ. ID NO: 2]. Immunoprecipitation of
[35S]-methionine-labeled extracts of PAb280, a mouse B-cell hybridoma, and PMMI, a mouse T-cell hybridoma, with
this anti-vav peptide antiserum revealed various polypeptides ranging in size between 75,000 and 105,000
daltons. The most intense band corresponded to a protein of about 95,000 daltons, a size that corresponds well
with that expected for the vav gene product.

To establish whether this 95,000 dalton polypeptide was indeed the product of a mouse vav gene, an
expression plasmid was generated by subcloning the entire cDNA insert of pMB24 into pMEX, a eukaryotic
expression vector [Martin-Zanca, D. et al., Mol. Cell Biol. 9, 24-33 (1989)]. The resulting plasmid, designated
pJC11, was co-transfected into NIH3T3 cells with pSVneo and colonies of G418-resistant cells were selected
for immunoprecipitation analysis. As illustrated in Figures 4C and D, cells transfected with pJC11 DNA
expressed a 95,000 dalton protein indistinguishable from that present in mouse PAb280 and PMMI hybridoma cell
lines. Moreover, immunoprecipitation of this 95,000 dalton protein was specifically blocked by preincubation
with the immunizing peptide (Figure 4D). These results indicate that p95vav is the product of a mouse vav
proto-oncogene.

Immunoprecipitation analysis of either hematopoietic cells or vav-transfected NIH3T3 clones consistently
revealed a minor protein species that migrates as a diffuse band of about 105,000 daltons. Immunoprecipitation
of this protein could be specifically blocked by competition with the immunizing peptide. Whether this protein
represents a modified form of p95vav or a different protein able to complex with the vav gene product awaits
further biochemical characterization.

5. Malignant activation of the vav proto-oncogene

Transfection of NIH3T3 cells with pJC11 DNA, an expression plasmid carrying a mouse vav proto-oncogene,
did not reveal significant levels of morphologic transformation (Figure 5). These results suggest that
the transforming properties of the vav oncogene might be due to the absence of the myc-related amino-terminal
domain and/or to the presence of the bacterial Tn5-derived sequences. To resolve this question, a truncated
mouse vav gene was generated by deleting those nucleotide sequences of pJC11 DNA encompassed between
the 5' Sal I site of the pMEX multiple cloning site and a Nru I site that lies just upstream of the second in-frame
ATG codon (nucleotides 209 to 211 in Figure 2) [see SEQ. ID NO: 1]. The resulting plasmid, designated pJC12,
codes for a truncated mouse vav protein that lacks 65 of the 67 amino-terminal residues absent in the human
vav oncogene product (Katzav, S. et al., supra). Transfection of NIH3T3 cells with pJC12 DNA resulted in the
appearance of about 3,000 foci of transformed cells per microgram of transfected DNA (Figure 5).
Immunoprecipitation of [35S]-methionine-labeled extracts of NIH3T3 cells transformed by pJC12 DNA with anti-vav
peptide antibodies revealed expression of the expected 88,000 dalton protein (not shown). These results indicate
that truncation of the amino-terminal domain of a mouse vav proto-oncogene product can activate its
transforming potential.

The transforming activity of pJC12 DNA is at least one order of magnitude lower than that of pSK27 DNA,
the expression plasmid containing the human vav oncogene (Figure 5). To examine whether the Tn5-derived
sequences also contribute to the transforming activity of the human vav oncogene, we generated pJC7, a
pMEX-derived expression plasmid similar to pJC11 except that the vav sequences were of human origin. Since
the longest human vav proto-oncogene cDNA clone ends four nucleotides short of the physiological ATG
initiator codon, translation from pJC7 DNA is likely to start in the second in-frame ATG, the initiator codon used
by pJC12. Transfection of NIH3T3 cells with pJC7 DNA resulted in the appearance of about 40,000 foci of
transformed cells per microgram of transfected DNA, a transforming activity comparable to that of the human
vav oncogene (Figure 5). These results indicate that the Tn5-derived sequences present in the human vav
oncogene do not contribute to its transforming activity. Moreover, they demonstrate that truncation of the amino
terminal domain of the vav gene product is sufficient to activate its neoplastic properties.

Finally, it was determined whether the human vav proto-oncogene possesses transforming activity. For this
purpose, pJC7 was modified by adding the four nucleotides (ATGG) presumably missing in our human vav
proto-oncogene cDNA clone. The resulting plasmid, pJC13, can only transform NIH3T3 cells with about 5% the
activity of its parental clone, pJC7 (Figure 5). NIH3T3 cells transformed by pJC13 DNA consistently
exhibited levels of expression of the normal p95vav proto-oncogene product 5- to 10-fold higher than those of
the truncated vav protein found in cells transformed by pJC7 or pSK27 (Figure 6). These results indicate that
the human vav proto-oncogene can only induce malignant transformation if overexpressed in NIH3T3 cells.

6. Identification of a second human vav oncogene: Mechanism of activation 

A second human vav oncogene has been identified during the course of gene transfer experiments using
DNAs isolated from mammary carcinomas (unpublished observations). To investigate whether this
independently isolated vav oncogene also became activated by truncation of its amino terminus, two DNA probes were
prepared by PCR-aided amplification of defined domains of the 5' region of pSK65, a human vav
proto-oncogene cDNA clone (Katzav, S. et al., supra). The first probe is a 180 bp Eco RI-Hinc II DNA fragment which
contains the 5' end of the human vav proto-oncogene cDNA clone, a region known to be absent in its transforming
allele (Figure 7A). The second probe is a 575 bp Sac I-Pst I DNA fragment that corresponds to a region located
3' to the leucine-rich domain and encompasses those sequences coding for the acidic region of the vav protein.
As shown in Figure 7B, the 575 bp Sac I-Pst I probe recognized an internal 7 kbp Sac I fragment of normal
human DNA which was also present in NIH3T3 cells transformed by the two independently isolated human vav
oncogenes. In contrast, the most 5' 180 bp Eco RI-Hinc II probe only hybridized to normal human DNA (Figure
7A). These results indicate that a second human vav oncogene, identified during gene transfer of mammary
carcinoma DNA into NIH3T3 cells, has also lost those 5' sequences coding for the amino-terminal moiety of
the vav leucine-rich region.

7. Contribution of the cysteine-rich domains to the biological activity of the vav gene products 

The mouse and human vav gene products contain two structures that resemble metal binding domains.
The first structure, located in residues 528-548 of a mouse p95vav protein (Figure 2), has a
Cys-X2-Cys-X13-Cys-X2-Cys sequence pattern. This motif has been previously found in several transcriptional activators
such as the products of the adenovirus E1A, the yeast GAL4 and various steroid receptor genes [Johnson, P.F. et
al., Annu. Rev. Biochem. 58, 799-839 (1989)]. The second structure possesses a sequence pattern
(His-X2-Cys-X6-Cys-X2-His) that has not been previously described. The spacing of the cysteine residues along these
putative metal binding structures (Cys-X2-Cys-X13-Cys-X2-Cys-X7-Cys-X6-Cys) is also reminiscent of the
phorbol ester binding domain of protein kinase C [Ono, Y. et al., Proc. Natl. Acad. Sci. USA 86, 4868-4871 (1989)].
To test whether these structures are required for vav gene function, single point mutations were engineered
in pJC12 and pJC7 DNAs that eliminated some of the conserved cysteine- and histidine-coding triplets. pJC12
and pJC7, two expression plasmids capable of inducing the malignant transformation of NIH3T3 cells, provide
a reliable biological assay to measure vav gene activity. In order to verify the presence of the desired mutation,
each of the mutated plasmids was submitted to nucleotide sequence analysis. In addition, these plasmids were
transfected into NIH3T3 cells to verify that they directed the synthesis of the expected vav gene products (not
shown).

As summarized in Table 2, replacement of the first or third cysteines of the metal binding-like domain by
serine residues completely abolished the transforming activity of a mouse vav gene present in pJC12. Similar
results were obtained when the first cysteine of the human vav gene was replaced by an arginine residue (Table
2). Finally, substitution of the histidine residue corresponding to the first position of a mouse
His-X2-Cys-X6-Cys-X2-His motif also abolished vav transforming activity (Table 2). This histidine residue is one of five vav
amino acids shared by the phorbol ester domains of protein kinase C. These results indicate that the overall
structure of the cysteine-rich domain of vav gene proteins is required for their biological function.
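
The residue changes listed in Table 2 can be related to the single-base changes described in Example I with a
small script. The following Python sketch is an illustration only. It assumes, as a labeled assumption not stated
in the text, that each mutated base is the first base of the affected codon, except for the G to C change of pJC18,
which is taken to be the second base. Under that assumption the outcome does not depend on which synonymous
codon is present in the cDNA. The function name substitution_outcomes is hypothetical.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def substitution_outcomes(wild_type, position, old_base, new_base):
    """Residues obtained by changing old_base to new_base at a codon position.

    position is 0-based within the codon. Every codon for wild_type that
    carries old_base at that position is considered.
    """
    outcomes = set()
    for codon, residue in CODON_TABLE.items():
        if residue == wild_type and codon[position] == old_base:
            mutant = codon[:position] + new_base + codon[position + 1:]
            outcomes.add(CODON_TABLE[mutant])
    return outcomes


# (wild-type residue, assumed codon position, original base, new base)
CASES = {
    "pJC17": ("C", 0, "T", "A"),   # T to A in a Cys codon
    "pJC18": ("C", 1, "G", "C"),   # G to C in a Cys codon (assumed 2nd base)
    "pJC19": ("H", 0, "C", "G"),   # C to G in a His codon
    "pJC15": ("C", 0, "T", "C"),   # T to C in a Cys codon
}

for plasmid, args in CASES.items():
    print(plasmid, sorted(substitution_outcomes(*args)))
```

Under these assumptions the four changes give serine, serine, aspartic acid and arginine respectively, matching
the S, S, D and R substitutions shown in the cysteine motif column of Table 2.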
5 All publications and patents referred to in the present application are incorporated herein by reference to 

the same extent as if each individual publication or patent was specifically and individually indicated to be incor- 
porated by reference. 
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Expression of a mouse vav pro to -oncogene 
in cells of murine origin 



CELL LINE 



CELL TYPE 



vav GENE 
EXPRESSION 



REFERENCE 



20 



25 



30 



35 



7.1.3 
MOFC 315 
A 20 
F412B2 

F412B2 + HMBA 
NIH3T3 
NIH3T3/»J*-2 



A31 



A31 + serum 



Macrophage 

Plasmacytoma 

B lymphocyte 

Erythr o 1 eukemi a 
(undifferentiated) 

Erythr o 1 eukemi a 
( di f f erenti ated ) 

Fibroblast 

( non-tumorigenic ) 

Fibroblast 
( tumor i genie ) 

Fibroblast 
(quiescent) 

Fibroblast 

( prol i f er ating ) 



+ 
+ 



Baumbach et al . , 1987 
ATCC TIB 23 
ATCC TIB 208 
Coppola and Cole, 1986f 



Jainchill et al., 1969*" 



Mann et al . , 1983 



ATCC CCL 163 



40 



45 



a 
b 
c 
d 
e 



See legend to Figure 4 for experimental details. 
Baumbach, W.R. et al., Mol . Cell. Biol. 7, 664-671 (1987) 
Coppola, J. A. and Cole, M.D., Nature 320 , 760-763 (1986) 
Jainchill, J.L. et al., J. Virol. 4, 549-553 (1969) 
Mann, R. et al., Cell 33, 153-159 (1983) 
TABLE 2

Contribution of the cysteine-rich sequences to the biological activity of vav gene proteins

PLASMID   SPECIES   CYSTEINE MOTIF (a)          TRANSFORMING ACTIVITY (ffu/µg DNA)
pJC12     Mouse     CX2CX13CX2CX4HX2CX6CX2H     450
pJC17     Mouse     SX2CX13CX2CX4HX2CX6CX2H     0
pJC18     Mouse     CX2CX13SX2CX4HX2CX6CX2H     0
pJC19     Mouse     CX2CX13CX2CX4DX2CX6CX2H     0
pJC5      Human     CX2CX13CX2CX4HX2CX6CX2H     5,000
pJC15     Human     RX2CX13CX2CX4HX2CX6CX2H     0

(a) Cysteine motifs (residues 528 to 566) contain metal binding-like domains
(Cys-X2-Cys-X13-Cys-X2-Cys and His-X2-Cys-X6-Cys-X2-His) and putative phorbol
ester binding regions (Cys-X2-Cys-X13-Cys-X7-Cys-X6-Cys). Substituted amino
acid residues are bolded and underlined.

(b) pSK27 DNA (see Figure 6) used as positive control in this experiment
yielded 5,000 ffu/µg DNA.
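
The cysteine motif notation of Table 2 can be read as a fixed-spacing pattern. The following Python sketch is an illustration only and is not part of the described experiments: it expresses the mouse motif as a regular expression and checks it against residues 528 to 566 of SEQ ID NO: 2, written here in one-letter code. The one-letter transcription, the helper names `substitute` and `has_motif`, and the residue positions 528, 545 and 553 (derived from the motif spacing for the pJC17, pJC18 and pJC19 substitutions) are assumptions of this sketch.

```python
import re

# Residues 528-566 of SEQ ID NO: 2 in one-letter code. This string was
# transcribed by hand from the listing and is an assumption of this sketch.
WILD_TYPE = "CKACQMLLRGTFYQGYRCYRCRAPAHKECLGRVPPCGRH"

# Cys-X2-Cys-X13-Cys-X2-Cys-X4-His-X2-Cys-X6-Cys-X2-His, as in Table 2.
MOTIF = re.compile(r"C.{2}C.{13}C.{2}C.{4}H.{2}C.{6}C.{2}H")


def substitute(seq, position, residue, first=528):
    """Return seq with the residue at a 1-based protein position replaced."""
    i = position - first
    return seq[:i] + residue + seq[i + 1:]


def has_motif(seq):
    """True if the whole segment matches the cysteine motif spacing."""
    return MOTIF.fullmatch(seq) is not None


# Hypothetical labels; positions are inferred from the motif, not quoted.
variants = {
    "wild type": WILD_TYPE,
    "Cys528 to Ser": substitute(WILD_TYPE, 528, "S"),
    "Cys545 to Ser": substitute(WILD_TYPE, 545, "S"),
    "His553 to Asp": substitute(WILD_TYPE, 553, "D"),
}

for name, seq in variants.items():
    print(f"{name}: motif intact = {has_motif(seq)}")
```

Under these assumptions only the unsubstituted segment matches the pattern. The check concerns the spacing of the conserved residues and says nothing about transforming activity, which Table 2 reports from the transfection assays.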



SEQUENCE LISTING 



NUMBER OF SEQUENCES: 2 



INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2793 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: N

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 14..2545
(D) OTHER INFORMATION: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GCCGGCAGCC ACC ATG GAG CTC TGG CGA CAG TGC ACC CAC TGG CTG ATC  49
               Met Glu Leu Trp Arg Gln Cys Thr His Trp Leu Ile
               1 5 10

CAG TGT CGG GTG CTG CCT CCC AGC CAC CGT GTG ACC TGG GAG GGG GCC  97
Gln Cys Arg Val Leu Pro Pro Ser His Arg Val Thr Trp Glu Gly Ala
15 20 25

CAG GTG TGT GAG CTG GCA CAG GCA CTG CGG GAC GGT GTC CTC TTG TGC  145
Gln Val Cys Glu Leu Ala Gln Ala Leu Arg Asp Gly Val Leu Leu Cys
30 35 40

CAA TTG CTT AAC AAC CTG CTT CCC CAG GCC ATT AAT CTT CGC GAG GTT  193
Gln Leu Leu Asn Asn Leu Leu Pro Gln Ala Ile Asn Leu Arg Glu Val
45 50 55 60

AAC TTG CGG CCC CAG ATG TCC CAG TTC CTT TGT CTT AAG AAC ATT CGA  241
Asn Leu Arg Pro Gln Met Ser Gln Phe Leu Cys Leu Lys Asn Ile Arg
65 70 75

ACC TTC CTG TCT ACT TGC TGT GAG AAG TTC GGC CTC AAG CGC AGT GAA  289
Thr Phe Leu Ser Thr Cys Cys Glu Lys Phe Gly Leu Lys Arg Ser Glu
80 85 90

CTC TTT GAG GCT TTT GAC CTC TTC GAT GTG CAG GAC TTT GGA AAG GTC  337
Leu Phe Glu Ala Phe Asp Leu Phe Asp Val Gln Asp Phe Gly Lys Val
95 100 105

ATC TAC ACC CTG TCT GCT CTG TCA TGG ACA CCC ATT GCC CAG AAC AAA  385
Ile Tyr Thr Leu Ser Ala Leu Ser Trp Thr Pro Ile Ala Gln Asn Lys
110 115 120

GGA ATC ATG CCC TTC CCA ACA GAG GAC AGC GCT CTG AAC GAC GAA GAT  433
Gly Ile Met Pro Phe Pro Thr Glu Asp Ser Ala Leu Asn Asp Glu Asp
125 130 135 140

ATT TAC AGT GGC CTT TCA GAC CAG ATT GAT GAC ACC GCA GAG GAA GAC  481
Ile Tyr Ser Gly Leu Ser Asp Gln Ile Asp Asp Thr Ala Glu Glu Asp
145 150 155

GAG GAC CTT TAT GAC TGC GTG GAA AAT GAG GAG GCA GAG GGG GAC GAG  529
Glu Asp Leu Tyr Asp Cys Val Glu Asn Glu Glu Ala Glu Gly Asp Glu
160 165 170

ATC TAC GAG GAC CTA ATG CGC TTG GAG TCG GTG CCT ACG CCA CCC AAG  577
Ile Tyr Glu Asp Leu Met Arg Leu Glu Ser Val Pro Thr Pro Pro Lys
175 180 185

ATG ACA GAG TAT GAT AAG CGC TGC TGC TGC CTG CGG GAG ATC CAG CAG  625
Met Thr Glu Tyr Asp Lys Arg Cys Cys Cys Leu Arg Glu Ile Gln Gln
190 195 200

ACG GAG GAG AAG TAT ACA GAC ACA CTG GGC TCC ATC CAG CAG CAC TTC  673
Thr Glu Glu Lys Tyr Thr Asp Thr Leu Gly Ser Ile Gln Gln His Phe
205 210 215 220

ATG AAG CCT CTG CAG CGA TTC CTT AAG CCT CAA GAC ATG GAG ACC ATC  721
Met Lys Pro Leu Gln Arg Phe Leu Lys Pro Gln Asp Met Glu Thr Ile
225 230 235

TTT GTC AAC ATT GAG GAG CTG TTC TCT GTG CAT ACC CAC TTC TTA AAG  769
Phe Val Asn Ile Glu Glu Leu Phe Ser Val His Thr His Phe Leu Lys
240 245 250

GAA CTG AAG GAT GCC CTG GCT GGC CCG GGA GCA ACA ACA CTG TAT CAG  817
Glu Leu Lys Asp Ala Leu Ala Gly Pro Gly Ala Thr Thr Leu Tyr Gln
255 260 265
[N marks a nucleotide that is illegible in the source; the amino acid lines of the affected rows follow SEQ ID NO: 2.]

NNN NNN NNN NNN NNN NNN GAG NNN NNN CTG GTN TAT GGC CGT TAT TGC  865
Val Phe Ile Lys Tyr Lys Glu Arg Phe Leu Val Tyr Gly Arg Tyr Cys
270 275 280

NNN NNN NNN NNN NNN NCC AGC AAG CAC NNN NNN NNN GTG GCC ACA GCA  913
Ser Gln Val Glu Ser Ala Ser Lys His Leu Asp Gln Val Ala Thr Ala
285 290 295 300

NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN  961
Arg Glu Asp Val Gln Met Lys Leu Glu Glu Cys Ser Gln Arg Ala Asn
305 310 315

NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN  1009
Asn Gly Arg Phe Thr Leu Arg Ser Ala Asp Gly Thr Tyr Ala Ala Gly
320 325 330

GCT GAA GTA CCA CCT CCT TCT CCA GGA GCT AGT GAA ACA CAC ACA NNN  1057
Ala Glu Val Pro Pro Pro Ser Pro Gly Ala Ser Glu Thr His Thr Gly
335 340 345

NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN  1105
Cys Tyr Arg Glu Gly Glu Leu Arg Leu Ala Leu Asp Ala Met Arg Asp
350 355 360

CTG GCA CAG TGC GTG AAC GAG GTC AAG AGG GAC AAT GAA ACC CTA CGC  1153
Leu Ala Gln Cys Val Asn Glu Val Lys Arg Asp Asn Glu Thr Leu Arg
365 370 375 380

NNN NNN NNN NNN NNN CAG CTG TCC ATT GAG AAC CTG GAC CAG TCT CTG  1201
Gln Ile Thr Asn Phe Gln Leu Ser Ile Glu Asn Leu Asp Gln Ser Leu
385 390 395

GCT AAC TAT GGC CGG CCC AAG ATT GAC GGT GAG CTC AAG ATT ACC TCA  1249
Ala Asn Tyr Gly Arg Pro Lys Ile Asp Gly Glu Leu Lys Ile Thr Ser
400 405 410

GTG GAG CGT CGC TCA AAG ACA GAC AGG TAT GCC TTC CTG CTC NNN NNN  1297
Val Glu Arg Arg Ser Lys Thr Asp Arg Tyr Ala Phe Leu Leu Asp Lys
415 420 425

GCA CTG CTC ATC TGT AAA CGC CGC GGG GAC TCT TAC GAC CTC AAA GCC  1345
Ala Leu Leu Ile Cys Lys Arg Arg Gly Asp Ser Tyr Asp Leu Lys Ala
430 435 440

NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN NNN  1393
Ser Val Asn Leu His Ser Phe Gln Val Ser Asp Asp Ser Ser Gly Glu
445 450 455 460

CGA GAC AAC AAG AAG TGG AGC CAT ATG TTC CTT CTG NNN NNN NNN NNN  1441
Arg Asp Asn Lys Lys Trp Ser His Met Phe Leu Leu Ile Glu Asp Gln
465 470 475

GGC GCC CAG GGC TAT GAG CTG TTC TTC AAG ACT CGG GAG CTG AAG AAG  1489
Gly Ala Gln Gly Tyr Glu Leu Phe Phe Lys Thr Arg Glu Leu Lys Lys
480 485 490

AAG TGG ATG GAA CAG TTC GAA ATG GCC ATC TCC AAC ATT TAC CCA GAG  1537
Lys Trp Met Glu Gln Phe Glu Met Ala Ile Ser Asn Ile Tyr Pro Glu
495 500 505

AAT GCT ACA GCC AAT GGG CAT GAT TTT CAG ATG TTC TCC TTT GAG GAG  1585
Asn Ala Thr Ala Asn Gly His Asp Phe Gln Met Phe Ser Phe Glu Glu
510 515 520

ACC ACT TCC TGC AAG GCC TGC CAG ATG TTA CTC AGA GGC ACA TTC TAC  1633
Thr Thr Ser Cys Lys Ala Cys Gln Met Leu Leu Arg Gly Thr Phe Tyr
525 530 535 540

CAG GGA TAT CGC TGT TAC AGG TGC CGG GCA CCT GCA CAC AAG GAG TGT  1681
Gln Gly Tyr Arg Cys Tyr Arg Cys Arg Ala Pro Ala His Lys Glu Cys
545 550 555

CTG GGG AGA GTG CCT CCC TGT GGT CGC CAT GGG CAA GAT TTC GCA GGA  1729
Leu Gly Arg Val Pro Pro Cys Gly Arg His Gly Gln Asp Phe Ala Gly
560 565 570

ACC ATG AAG AAG GAC AAG CTC CAT CGA AGG GCC CAG GAC AAG AAA AGG  1777
Thr Met Lys Lys Asp Lys Leu His Arg Arg Ala Gln Asp Lys Lys Arg
575 580 585

AAT GAA TTG GGT CTG CCT AAG ATG GAA GTG TTT CAG GAA TAC TAT GGG  1825
Asn Glu Leu Gly Leu Pro Lys Met Glu Val Phe Gln Glu Tyr Tyr Gly
590 595 600

ATC CCA CCA CCA CCT GGA GCC TTT GGG CCA TTT TTA CGG CTC AAC CCT  1873
Ile Pro Pro Pro Pro Gly Ala Phe Gly Pro Phe Leu Arg Leu Asn Pro
605 610 615 620

GGG GAC ATT GTG GAG CTC ACT AAG GCA GAG GCT GAG CAC AAC TGG TGG  1921
Gly Asp Ile Val Glu Leu Thr Lys Ala Glu Ala Glu His Asn Trp Trp
625 630 635

GAG GGA AGG AAT ACT GCT ACA AAT GAA GTC GGC TGG TTT CCC TGT AAC  1969
Glu Gly Arg Asn Thr Ala Thr Asn Glu Val Gly Trp Phe Pro Cys Asn
640 645 650

AGA GTG CAT CCC TAT GTC CAC GGC CCT CCT CAG GAC CTG TCT GTG CAT  2017
Arg Val His Pro Tyr Val His Gly Pro Pro Gln Asp Leu Ser Val His
655 660 665

CTC TGG TAT GCG GGC CCT ATG GAA CGA GCA GGC GCT GAG GGC ATC CTC  2065
Leu Trp Tyr Ala Gly Pro Met Glu Arg Ala Gly Ala Glu Gly Ile Leu
670 675 680

ACC AAC CGT TCT GAT GGG ACC TAT CTG GTG CGG CAG AGG GTG AAA GAT  2113
Thr Asn Arg Ser Asp Gly Thr Tyr Leu Val Arg Gln Arg Val Lys Asp
685 690 695 700

ACA GCG GAG TTC GCC ATC AGC ATT AAG TAT AAC GTG GAG GTC AAG CAT  2161
Thr Ala Glu Phe Ala Ile Ser Ile Lys Tyr Asn Val Glu Val Lys His
705 710 715

ATT AAA ATC ATG ACG TCA GAG GGG TTG TAC CGG ATC ACA GAG AAG AAG  2209
Ile Lys Ile Met Thr Ser Glu Gly Leu Tyr Arg Ile Thr Glu Lys Lys
720 725 730

GCT TTC CGG GGC CTT CTG GAA CTG GTA GAG TTT TAT CAG CAG AAT TCC  2257
Ala Phe Arg Gly Leu Leu Glu Leu Val Glu Phe Tyr Gln Gln Asn Ser
735 740 745

CTC AAA GAT TGC TTC AAG TCG TTG GAC ACC ACC TTG CAG TTT CCT TAT  2305
Leu Lys Asp Cys Phe Lys Ser Leu Asp Thr Thr Leu Gln Phe Pro Tyr
750 755 760

AAG GAA CCT GAG AGG AGA GCC ATC AGC AAG CCA CCA GCT GGA AGC ACC  2353
Lys Glu Pro Glu Arg Arg Ala Ile Ser Lys Pro Pro Ala Gly Ser Thr
765 770 775 780

AAG TAT TTT GGC ACT GCC AAA GCC CGC TAC GAC TTC TGT GCC CGG GAC  2401
Lys Tyr Phe Gly Thr Ala Lys Ala Arg Tyr Asp Phe Cys Ala Arg Asp
785 790 795

AGG TCG GAA CTG TCC CTT AAG GAG GGT GAT ATC ATC AAG ATC CTC AAT  2449
Arg Ser Glu Leu Ser Leu Lys Glu Gly Asp Ile Ile Lys Ile Leu Asn
800 805 810

AAG AAG GGA CAG CAA GGC TGG TGG CGT GGG GAG ATC TAC GGC CGG ATC  2497
Lys Lys Gly Gln Gln Gly Trp Trp Arg Gly Glu Ile Tyr Gly Arg Ile
815 820 825

GGC TGG TTC CCT TCT AAC TAT GTG GAG GAA GAC TAT TCC GAA TAT TGC  2545
Gly Trp Phe Pro Ser Asn Tyr Val Glu Glu Asp Tyr Ser Glu Tyr Cys
830 835 840

TGAGCCTGGT GCCCTGTAGG ACACAGAGAG AGGCAGATGA AGGCTGAGCC CAGGATGCTA

GCAGGGTTGA GGGGCCATGA ACTGTCCTCA CCACGGAGGA TCTGGATGCG TGCAGATGGC

TAGTGGCCAG CTGGCAGGGT TCCCAGGATA AAGCCCAGAG ATGCGTAATT TATAACACAC

TGATTTTCTC CAGTCCTCCA CGAAAGGTGG GGCTTGAGGC AACTGATTCT AATAAAGTGA

GGAGbAGCA 
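
The relationship between the CDS feature (location 14..2545) and the 844 amino acid polypeptide of SEQ ID NO: 2 can be checked with a short calculation. The following Python sketch is an illustration under stated assumptions, not a tool referred to in this specification: it builds the standard genetic code table, translates the first twelve codons of SEQ ID NO: 1, and confirms that the CDS span corresponds to 844 codons. The names `CODON_TABLE`, `translate` and `FIRST_49` are hypothetical, and the 49 nucleotides were transcribed by hand from the listing.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (first base varies
# slowest). "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding sequence codon by codon, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 49 nucleotides of SEQ ID NO: 1 (13 untranslated, then 12 codons),
# transcribed by hand for this sketch.
FIRST_49 = "GCCGGCAGCCACC" + "ATGGAGCTCTGGCGACAGTGCACCCACTGGCTGATC"

CDS_START, CDS_END = 14, 2545            # from the CDS feature of SEQ ID NO: 1
cds_length = CDS_END - CDS_START + 1     # 2532 nucleotides
codon_count = cds_length // 3            # 844 codons

print(translate(FIRST_49[CDS_START - 1:]))  # MELWRQCTHWLI
print(cds_length, codon_count)              # 2532 844
```

The translated prefix corresponds to Met Glu Leu Trp Arg Gln Cys Thr His Trp Leu Ile, the first twelve residues of SEQ ID NO: 2, and the codon count agrees with the stated length of 844 amino acids.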
INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 844 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Glu Leu Trp Arg Gln Cys Thr His Trp Leu Ile Gln Cys Arg Val
1 5 10 15

Leu Pro Pro Ser His Arg Val Thr Trp Glu Gly Ala Gln Val Cys Glu
20 25 30

Leu Ala Gln Ala Leu Arg Asp Gly Val Leu Leu Cys Gln Leu Leu Asn
35 40 45

Asn Leu Leu Pro Gln Ala Ile Asn Leu Arg Glu Val Asn Leu Arg Pro
50 55 60

Gln Met Ser Gln Phe Leu Cys Leu Lys Asn Ile Arg Thr Phe Leu Ser
65 70 75 80

Thr Cys Cys Glu Lys Phe Gly Leu Lys Arg Ser Glu Leu Phe Glu Ala
85 90 95

Phe Asp Leu Phe Asp Val Gln Asp Phe Gly Lys Val Ile Tyr Thr Leu
100 105 110

Ser Ala Leu Ser Trp Thr Pro Ile Ala Gln Asn Lys Gly Ile Met Pro
115 120 125

Phe Pro Thr Glu Asp Ser Ala Leu Asn Asp Glu Asp Ile Tyr Ser Gly
130 135 140

Leu Ser Asp Gln Ile Asp Asp Thr Ala Glu Glu Asp Glu Asp Leu Tyr
145 150 155 160

Asp Cys Val Glu Asn Glu Glu Ala Glu Gly Asp Glu Ile Tyr Glu Asp
165 170 175

Leu Met Arg Leu Glu Ser Val Pro Thr Pro Pro Lys Met Thr Glu Tyr
180 185 190

Asp Lys Arg Cys Cys Cys Leu Arg Glu Ile Gln Gln Thr Glu Glu Lys
195 200 205

Tyr Thr Asp Thr Leu Gly Ser Ile Gln Gln His Phe Met Lys Pro Leu
210 215 220

Gln Arg Phe Leu Lys Pro Gln Asp Met Glu Thr Ile Phe Val Asn Ile
225 230 235 240

Glu Glu Leu Phe Ser Val His Thr His Phe Leu Lys Glu Leu Lys Asp
245 250 255

Ala Leu Ala Gly Pro Gly Ala Thr Thr Leu Tyr Gln Val Phe Ile Lys
260 265 270

Tyr Lys Glu Arg Phe Leu Val Tyr Gly Arg Tyr Cys Ser Gln Val Glu
275 280 285

Ser Ala Ser Lys His Leu Asp Gln Val Ala Thr Ala Arg Glu Asp Val
290 295 300

Gln Met Lys Leu Glu Glu Cys Ser Gln Arg Ala Asn Asn Gly Arg Phe
305 310 315 320

Thr Leu Arg Ser Ala Asp Gly Thr Tyr Ala Ala Gly Ala Glu Val Pro
325 330 335

Pro Pro Ser Pro Gly Ala Ser Glu Thr His Thr Gly Cys Tyr Arg Glu
340 345 350

Gly Glu Leu Arg Leu Ala Leu Asp Ala Met Arg Asp Leu Ala Gln Cys
355 360 365

Val Asn Glu Val Lys Arg Asp Asn Glu Thr Leu Arg Gln Ile Thr Asn
370 375 380

Phe Gln Leu Ser Ile Glu Asn Leu Asp Gln Ser Leu Ala Asn Tyr Gly
385 390 395 400

Arg Pro Lys Ile Asp Gly Glu Leu Lys Ile Thr Ser Val Glu Arg Arg
405 410 415

Ser Lys Thr Asp Arg Tyr Ala Phe Leu Leu Asp Lys Ala Leu Leu Ile
420 425 430

Cys Lys Arg Arg Gly Asp Ser Tyr Asp Leu Lys Ala Ser Val Asn Leu
435 440 445

His Ser Phe Gln Val Ser Asp Asp Ser Ser Gly Glu Arg Asp Asn Lys
450 455 460

Lys Trp Ser His Met Phe Leu Leu Ile Glu Asp Gln Gly Ala Gln Gly
465 470 475 480

Tyr Glu Leu Phe Phe Lys Thr Arg Glu Leu Lys Lys Lys Trp Met Glu
485 490 495

Gln Phe Glu Met Ala Ile Ser Asn Ile Tyr Pro Glu Asn Ala Thr Ala
500 505 510

Asn Gly His Asp Phe Gln Met Phe Ser Phe Glu Glu Thr Thr Ser Cys
515 520 525

Lys Ala Cys Gln Met Leu Leu Arg Gly Thr Phe Tyr Gln Gly Tyr Arg
530 535 540

Cys Tyr Arg Cys Arg Ala Pro Ala His Lys Glu Cys Leu Gly Arg Val
545 550 555 560

Pro Pro Cys Gly Arg His Gly Gln Asp Phe Ala Gly Thr Met Lys Lys
565 570 575

Asp Lys Leu His Arg Arg Ala Gln Asp Lys Lys Arg Asn Glu Leu Gly
580 585 590

Leu Pro Lys Met Glu Val Phe Gln Glu Tyr Tyr Gly Ile Pro Pro Pro
595 600 605

Pro Gly Ala Phe Gly Pro Phe Leu Arg Leu Asn Pro Gly Asp Ile Val
610 615 620

Glu Leu Thr Lys Ala Glu Ala Glu His Asn Trp Trp Glu Gly Arg Asn
625 630 635 640

Thr Ala Thr Asn Glu Val Gly Trp Phe Pro Cys Asn Arg Val His Pro
645 650 655

Tyr Val His Gly Pro Pro Gln Asp Leu Ser Val His Leu Trp Tyr Ala
660 665 670

Gly Pro Met Glu Arg Ala Gly Ala Glu Gly Ile Leu Thr Asn Arg Ser
675 680 685

Asp Gly Thr Tyr Leu Val Arg Gln Arg Val Lys Asp Thr Ala Glu Phe
690 695 700

Ala Ile Ser Ile Lys Tyr Asn Val Glu Val Lys His Ile Lys Ile Met
705 710 715 720

Thr Ser Glu Gly Leu Tyr Arg Ile Thr Glu Lys Lys Ala Phe Arg Gly
725 730 735

Leu Leu Glu Leu Val Glu Phe Tyr Gln Gln Asn Ser Leu Lys Asp Cys
740 745 750

Phe Lys Ser Leu Asp Thr Thr Leu Gln Phe Pro Tyr Lys Glu Pro Glu
755 760 765

Arg Arg Ala Ile Ser Lys Pro Pro Ala Gly Ser Thr Lys Tyr Phe Gly
770 775 780

Thr Ala Lys Ala Arg Tyr Asp Phe Cys Ala Arg Asp Arg Ser Glu Leu
785 790 795 800

Ser Leu Lys Glu Gly Asp Ile Ile Lys Ile Leu Asn Lys Lys Gly Gln
805 810 815

Gln Gly Trp Trp Arg Gly Glu Ile Tyr Gly Arg Ile Gly Trp Phe Pro
820 825 830

Ser Asn Tyr Val Glu Glu Asp Tyr Ser Glu Tyr Cys
835 840






Claims 



1. An isolated nucleic acid molecule comprising a nucleic acid sequence coding for all or part of a mouse
vav proto-oncogene protein or for a modified mouse vav proto-oncogene protein.

2. The nucleic acid molecule according to Claim 1 which is a DNA molecule and wherein the nucleic acid 
sequence is a DNA sequence. 

3. The DNA molecule according to Claim 2 wherein the DNA sequence has the nucleotide sequence
substantially as shown in Figure 2 [SEQ. ID No: 1].

4. The DNA molecule according to Claim 2 wherein the DNA sequence has part of the nucleotide sequence 
substantially as shown in Figure 2 [SEQ. ID No: 1]. 

5. A DNA molecule having a DNA sequence which is complementary to the DNA sequence according to
Claims 3 or 4.

6. An expression vector comprising a DNA sequence coding for all or part of a mouse vav proto-oncogene 
protein or for a modified mouse vav proto-oncogene protein. 

7. The expression vector according to Claim 6 comprising one or more control DNA sequences capable of
directing the replication and/or the expression of and operatively linked to the DNA sequence coding for
all or part of a mouse vav proto-oncogene protein or for a modified mouse vav proto-oncogene protein.

8. The expression vector according to Claim 6 wherein the DNA sequence coding for all or part of a mouse
vav proto-oncogene protein or for a modified mouse vav proto-oncogene protein has the nucleotide
sequence substantially as shown in Figure 2 [SEQ. ID No: 1].

9. The expression vector according to Claim 6 wherein the DNA sequence coding for all or part of a mouse
vav proto-oncogene protein or for a modified mouse vav proto-oncogene protein has part of the nucleotide
sequence substantially as shown in Figure 2 [SEQ. ID NO: 1].

10. The expression vector according to Claim 6 designated pMB24. 

11. An expression vector having the identifying characteristics of the expression vector according to Claim 

12. A prokaryotic or eukaryotic host cell containing the expression vector according to any one of Claims 6 to 

13. A method for producing a polypeptide molecule which comprises all or part of a mouse vav proto-oncogene
protein or a modified mouse vav proto-oncogene protein comprising culturing a host cell according to Claim
12 under conditions permitting expression of the polypeptide molecule.

14. A method for detecting a nucleic acid sequence coding for all or part of a mouse vav proto-oncogene pro- 
tein or a related nucleic acid sequence comprising contacting the nucleic acid sequence with a detectable 
marker which binds specifically to at least part of the nucleic acid sequence, and detecting the marker so 
bound, the presence of bound marker indicating the presence of the nucleic acid sequence. 

15. The method according to Claim 14 wherein the nucleic acid sequence is a DNA sequence. 

16. The method according to Claim 14 wherein the nucleic acid sequence is an RNA sequence. 

17. The method according to Claim 15 wherein the DNA sequence has the nucleotide sequence substantially 
as shown in Figure 2 [SEQ. ID No: 1]. 

18. The method according to Claim 15 wherein the DNA sequence has part of the nucleotide sequence sub- 
stantially as shown in Figure 2 [SEQ. ID No: 1]. 

19. The method according to any one of claims 14 to 18 wherein the detectable marker is a nucleotide sequ- 
ence complementary to at least a portion of the nucleic acid sequence. 
20. The method according to Claim 19 wherein the nucleotide sequence is a complementary DNA sequence.

21. The method according to Claim 15 wherein the DNA sequence is a genomic DNA sequence. 

22. The method according to Claim 16 wherein the RNA sequence is a messenger RNA sequence.

23. An isolated polypeptide molecule comprising all or part of a mouse vav proto-oncogene protein or a mod- 
ified mouse vav proto-oncogene protein. 

24. An isolated polypeptide molecule encoded by the DNA sequence according to Claim 2.

25. The polypeptide molecule according to Claim 23 having the amino acid sequence substantially as shown 
in Figure 2 [SEQ. ID NO: 2]. 

26. The polypeptide molecule according to Claim 23 having part of the amino acid sequence substantially as
shown in Figure 2 [SEQ. ID NO: 2].
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